Using Decision Tree Classifiers in Source Code Analysis to Recognize Algorithms: An Experiment with Sorting Algorithms
نویسنده
چکیده
We discuss algorithm recognition (AR) and present a method for recognizing algorithms automatically from Java source code. The method consists of two phases. In the first phase, the recognizable algorithms are converted into the vectors of characteristics, which are computed based on static analysis of program code, including various statistics of language constructs and analysis of Roles of Variables in the target program. In the second phase, the algorithms are classified based on these vectors using the C4.5 decision tree classifier. We demonstrate the performance of the method by applying it to sorting algorithms. Using leave-one-out cross-validation technique, we have conducted an experimental evaluation of the classification performance showing that the average classification accuracy is 98.1% (the data set consisted of five different types of sorting algorithms). The results show the applicability and usefulness of roles of variables inAR, and illustrate that the C4.5 algorithm is a suitable decision tree classifier for our purpose. The limitations of the method are also discussed.
منابع مشابه
Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images
Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...
متن کاملUsing Data Mining and Three Decision Tree Algorithms to Optimize the Repair and Maintenance Process
The purpose of this research is to predict the failure of devices using a data mining tool. For this purpose, at the outset, an appropriate database consists of 392 records of ongoing failures in a pharmaceutical company in 1394, in the next step, by analyzing 9 characteristics and type of failure as a database class, analyzes have been used. In this regard, three decision tree algorithms have ...
متن کاملImproving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran
An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...
متن کاملApplication of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملEarly Prediction of Gestational Diabetes Using Decision Tree and Artificial Neural Network Algorithms
Introduction: Gestational diabetes is associated with many short-term and long-term complications in mothers and newborns; hence, the detection of its risk factors can contribute to the timely diagnosis and prevention of relevant complications. The present study aimed to design and compare Gestational diabetes mellitus (GDM) prediction models using artificial intelligence algorithms. Materials ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Comput. J.
دوره 54 شماره
صفحات -
تاریخ انتشار 2011